VIDEO ENCODER WITH LOW COMPLEXITY NOISE REDUCTION

The invention provides a method for performing noise reduction inside of a video 
compression encoder with low incremental complexity. The motion estimation portion of 
a video encoder is additionally used for the noise reduction process, with multiple motion
estimation decision sets stored so that multiple predictors may be used in
motion-compensated temporal filtering.



Noise reduction prior to video encoding can improve the quality of video encoding of a 
noisy video sequence at a given bitrate. However, the best noise reduction techniques are 
very computationally complex to implement. In the invention, a noise reduction function 
can be added to a video encoder with very low incremental complexity.



It is well understood that, at a given bitrate, noisy video sequences are more difficult to
compress using standard video compression techniques than clean video sequences.
Noise reduction can be applied as a pre-processing function prior to video
compression, as shown in Figure 1. In such a system, noise reduction is applied to a
sequence of input pictures, creating a sequence of noise reduced pictures. The noise
reduced pictures are then encoded using a video encoder, creating a compressed video 
bitstream. 

Spatial and/or temporal filtering have been used in prior noise reduction methods. 
Temporal filtering involves applying a filtering function, such as an average, to the pixels 
from several different input pictures to create filtered pixels. Temporal filtering of video 
sequences generally falls into one of two categories, motion compensated or non-motion 
compensated. For video sequences containing motion, the use of motion compensated 
methods generally outperforms non-motion compensated methods. Motion compensated
methods, however, are generally the more computationally expensive of the two
categories.
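
As a rough illustration of the non-motion compensated case, the sketch below averages co-located pixels across several pictures. It is a minimal Python sketch and is not taken from the disclosure: the function name temporal_average, the list-of-rows picture layout, and the round-to-nearest integer division are all assumptions made for the example.

```python
def temporal_average(pictures):
    """Average co-located pixels across pictures (no motion compensation).

    `pictures` is a list of equally sized 2-D lists of integer samples.
    The layout and the rounding rule are assumptions for this sketch.
    """
    height = len(pictures[0])
    width = len(pictures[0][0])
    count = len(pictures)
    return [
        [
            # Integer average with round-to-nearest.
            (sum(pic[y][x] for pic in pictures) + count // 2) // count
            for x in range(width)
        ]
        for y in range(height)
    ]
```

A motion compensated filter would replace the co-located sample pic[y][x] with a sample displaced by a motion vector, which is where the extra computation comes from.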



Most video compression standards (MPEG, H.263, H.264) use motion estimation and 
compensation in the encoding process. 

The H.264 video compression system (also referred to as JVT or MPEG AVC) uses tree- 
structured hierarchical macroblock partitions. Inter-coded 16x16 pixel macroblocks may 
be broken into macroblock partitions, of sizes 16x8, 8x16, or 8x8. Macroblock partitions 
of 8x8 pixels are also known as sub-macroblocks. Sub-macroblocks may be further 
broken into sub-macroblock partitions, of sizes 8x4, 4x8, and 4x4. An encoder may 
select how to divide the macroblock into partitions and sub-macroblock partitions based 
on the characteristics of a particular macroblock, in order to maximize compression 
efficiency and subjective quality. 

Multiple reference pictures may be used for inter-prediction, with a reference picture 
index coded to indicate which of the multiple reference pictures is used. In P pictures (or 
P slices), only single directional prediction is used, and the allowable reference pictures 
are managed in list 0. In B pictures (or B slices), two lists of reference pictures are
managed, list 0 and list 1. In B pictures (or B slices), single directional prediction using
either list 0 or list 1 is allowed, or bi-prediction using both list 0 and list 1 is allowed.
When bi-prediction is used, the list 0 and the list 1 predictors are averaged together to
form a final predictor.
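
The averaging step in bi-prediction can be sketched as follows. This is an illustrative Python fragment only: the function name bi_predict and the flat list representation of a predictor block are hypothetical, and the round-half-up rule is an assumption, since the text above only says that the two predictors are averaged.

```python
def bi_predict(pred_l0, pred_l1):
    """Average a list 0 predictor and a list 1 predictor sample by sample.

    Both inputs are flat lists of integer samples of equal length.
    Adding 1 before the shift rounds halves upward (an assumption).
    """
    return [(a + b + 1) >> 1 for a, b in zip(pred_l0, pred_l1)]
```

For example, bi_predict([100, 50, 7], [104, 51, 8]) yields [102, 51, 8].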

Figure 2 shows a normal video encoding system block diagram for use with H.264 or
similar video compression systems. For an H.264 encoder, the motion estimation process 
inputs are the input video sequence pictures and the previous coded pictures, which are 
stored in the reference picture stores. For each macroblock in a current picture, the 
motion estimation process compares the current macroblock with some pre-determined
number of reference pictures. A macroblock mode, which indicates the breakdown of the 
macroblock into the various partitions sizes, is output for each macroblock. For each 
macroblock partition, a reference picture index is output. For each macroblock partition
or sub-macroblock partition, a motion vector is output. The motion estimator has
considerable freedom to decide what are the best macroblock mode, reference picture
indices and motion vectors for a macroblock, with the goal to create a good predictor for
the current picture, so that the current picture may be encoded efficiently. Once these
decisions are made in the motion estimation process, a predictor is formed in the motion
compensation process, and the predictor is subtracted from the input picture, to create a
difference picture. The difference picture is coded using a block transform, quantizer
and entropy coder. Inverse quantization and inverse transform are applied, and the
coded/decoded picture is stored in the reference picture stores for use in the coding of
later pictures.

In this invention, the motion estimation function of a video encoder is also used to perform
noise reduction. The incremental complexity of performing noise reduction as part of a video 
encoder is very small compared to that of a standalone video noise reduction system. For 
noisy video sequences, this invention can significantly improve the compressed video quality
at a particular bitrate as compared to a normal video encoder. 

This invention can be used with any block-based motion compensation video compression 
system. However, the best results are obtained with compression systems like H.264 that use multiple
reference pictures, because the motion estimation functions for the multiple reference pictures
can be re-used in both the encoder and noise reducer, allowing multiple pictures to be used in 
the noise reduction filtering process. 

Figure 3 shows a video encoder with noise reducer, in accordance with the present invention. 
Similar to the prior art system in Figure 2, the motion estimation process inputs are the input 
video sequence pictures and the previous coded pictures, which are stored in the reference
picture stores. However, instead of outputting a single best macroblock mode for the
macroblock, a reference picture index for the macroblock partition and motion vector for a
macroblock partition or sub-macroblock partition, in accordance with the current invention
the output is the best N sets of (Mode, RefPicIndex, and MV) for the partitions and
sub-macroblock partitions of the macroblock, referred to as
motion estimation decision sets. 
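
One way to picture a motion estimator that keeps the best N decision sets instead of a single winner is the full-search sketch below. It is a simplified Python illustration under stated assumptions, not the encoder described here: it searches a single square block without partitioning, uses the sum of absolute differences (SAD) as the cost, and the names DecisionSet, sad, and best_n_decision_sets are hypothetical.

```python
import heapq
from typing import NamedTuple


class DecisionSet(NamedTuple):
    """One (Mode, RefPicIndex, MV) candidate; cost comes first so it sorts by cost."""
    cost: int
    mode: str
    ref_idx: int
    mv: tuple


def sad(cur, ref, x0, y0, dx, dy, size):
    """Sum of absolute differences between a block and its displaced reference."""
    return sum(
        abs(cur[y0 + y][x0 + x] - ref[y0 + y + dy][x0 + x + dx])
        for y in range(size)
        for x in range(size)
    )


def best_n_decision_sets(cur, refs, x0, y0, size, search, n):
    """Full search over all reference pictures, returning the n lowest-cost sets."""
    height, width = len(cur), len(cur[0])
    candidates = []
    for ref_idx, ref in enumerate(refs):
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                # Skip displacements that fall outside the reference picture.
                if not (0 <= x0 + dx and x0 + dx + size <= width):
                    continue
                if not (0 <= y0 + dy and y0 + dy + size <= height):
                    continue
                cost = sad(cur, ref, x0, y0, dx, dy, size)
                candidates.append(DecisionSet(cost, "whole-block", ref_idx, (dx, dy)))
    # Best (lowest cost) first, matching the "first set is the best" convention.
    return heapq.nsmallest(n, candidates)
```

Because the search already evaluates every candidate, retaining N of them rather than one adds little work, which is consistent with the low incremental complexity argued above.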

Figure 4 shows a flow chart of the noise reduction process for the picture to be coded, in
accordance with the current invention. The macroblocks in the picture are looped with loop
index mb. For each macroblock, motion estimation is performed, with N motion estimation 
decision sets stored. Then, noise reduction is applied to the macroblock, using the stored N 
motion estimation decision sets. The noise reduction process is described in more detail 
below. Then video encoding of the macroblock is performed. First, the motion 
compensation process creates a predictor for the macroblock using the first of the N stored 
motion estimation decision sets, which is considered to be the best of the sets. This prediction
is subtracted from the filtered picture, with the difference picture transformed, quantized, and
entropy coded, and then inverse quantized and inverse transformed and stored in the
reference picture stores. 
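
The per-macroblock flow of Figure 4 can be sketched as a loop in which each stage is supplied as a function. This is a structural sketch in Python, not an implementation of the encoder: the function name and all four stage parameters are hypothetical, macroblocks are flat lists of samples, and the transform, quantization, and entropy coding are collapsed into a single code_residual stage that returns the decoded residual.

```python
def encode_picture_with_noise_reduction(
    macroblocks, estimate, reduce_noise, compensate, code_residual, n
):
    """Loop over macroblocks: estimate, noise reduce, then encode.

    estimate(mb, n)          -> list of n decision sets, best first
    reduce_noise(mb, sets)   -> filtered macroblock (uses all n sets)
    compensate(decision_set) -> predictor block
    code_residual(residual)  -> decoded residual after coding and decoding
    Returns the reconstructed macroblocks, standing in for the reference store.
    """
    reconstructed = []
    for mb in macroblocks:
        decision_sets = estimate(mb, n)
        filtered = reduce_noise(mb, decision_sets)
        # Only the first (best) decision set is used for the actual encoding.
        predictor = compensate(decision_sets[0])
        residual = [f - p for f, p in zip(filtered, predictor)]
        decoded = code_residual(residual)
        reconstructed.append([p + d for p, d in zip(predictor, decoded)])
    return reconstructed
```

The point of the structure is that estimate runs once per macroblock and its output feeds both reduce_noise and compensate.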

In one embodiment of the present invention, the N motion estimation data sets are required to 
each use a different reference picture index for each macroblock partition. 
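
A simple way to satisfy this constraint is to keep only the lowest-cost candidate per reference picture index before selecting the best N. The Python sketch below illustrates that idea under assumptions: the function name best_per_reference is hypothetical, and each candidate is represented as a (cost, ref_idx, mv) tuple, which is not a format defined in this document.

```python
def best_per_reference(candidates, n):
    """Return up to n candidates, each with a distinct reference picture index.

    `candidates` is an iterable of (cost, ref_idx, mv) tuples.
    """
    best = {}
    for cost, ref_idx, mv in candidates:
        # Keep only the cheapest candidate seen for each reference index.
        if ref_idx not in best or cost < best[ref_idx][0]:
            best[ref_idx] = (cost, ref_idx, mv)
    # Sort by cost so the overall best decision set comes first.
    return sorted(best.values())[:n]
```

With this selection, the N predictors used for filtering come from N different pictures, so each contributes an independent noise sample.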

The N motion estimation decision sets are input to the noise reduction process. Figure 5 
shows a flowchart of the noise reduction process. Each pixel in the block is looped through, 
with loop index p. Each of the N motion estimation decision sets is looped through, with
loop index i. For each i, a predictor, pred[i], is formed for the pixel by performing motion
compensation using the i-th motion estimation decision set. A difference measure is
computed which compares the value of the current pixel pic[p] with the predictor, pred[i].
This difference measure may include luma and/or chroma values in the calculation. An 
example difference measure is the absolute difference value. If the difference measure is 
below a threshold, the predictor is added to the filtering set, fset, to be used in the noise 
reduction filtering operation. 
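
The threshold test for a single pixel can be sketched as below, using the absolute difference as the difference measure. This is a minimal Python sketch: the function name build_filter_set is hypothetical, and the predictors are assumed to be already motion compensated luma values for the pixel in question.

```python
def build_filter_set(cur_pixel, predictors, threshold):
    """Collect the predictors whose absolute difference from the pixel is below threshold.

    `predictors` holds pred[i] for i in 0..N-1 for one pixel position.
    """
    fset = []
    for pred in predictors:
        if abs(cur_pixel - pred) < threshold:
            fset.append(pred)
    return fset
```

For example, build_filter_set(100, [102, 130, 97, 100], 5) returns [102, 97, 100]; the predictor 130 is rejected, which keeps a poorly matched block from smearing into the filtered result.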






After all N motion estimation data sets have been threshold tested to form the complete filter 
set, fset, a filtering operation is performed on fset. The filtering operation is separately 
performed on luma samples and on associated samples of both chroma components. Any of 
several different filter functions may be used in the noise reduction filtering operation, such 
as computing an average, a weighted average, or a median. The filtering operation may also 
include spatial neighbors in the computation. The spatial neighbors may also be compared 
with a threshold to consider whether to include the spatial neighbors in the filtering operation. 
The result of the pixel filtering operation is placed in the filtered picture, as Filt_pic[p]. The
filtered picture, Filt_pic, is then used as the input to the rest of the video encoding process.
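
Two of the filter functions named above, a (weighted) average and a median, can be sketched for one pixel as follows. This Python fragment is illustrative only. The function name filter_pixel and the cur_weight parameter are hypothetical, and including the current pixel alongside fset in the filter is an assumption, since the text does not state whether the current pixel is a member of the filtering set.

```python
from statistics import median


def filter_pixel(cur_pixel, fset, method="average", cur_weight=1):
    """Filter one pixel using the predictors that passed the threshold test.

    The current pixel is included in the filter (an assumption). With
    cur_weight=1 the "average" method is a plain average; a larger
    cur_weight gives a weighted average that favours the current pixel.
    """
    if method == "average":
        total = cur_weight * cur_pixel + sum(fset)
        count = cur_weight + len(fset)
        return (total + count // 2) // count  # round to nearest
    if method == "median":
        return median([cur_pixel] + list(fset))
    raise ValueError("unknown filter method: " + method)
```

When fset is empty, both methods return the current pixel unchanged, so a pixel with no acceptable predictor passes through unfiltered.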



Figure 3 illustrates a particular embodiment of the current invention where the filtered
pictures are stored in filtered picture stores, and used as the inputs to the noise reduction 
process when noise reducing later pictures. Alternatively, the original input pictures or the
pictures in the reference picture stores may be used as inputs to the noise reduction process.

For macroblocks in intra (I) pictures, spatial-only filtering may be performed. Alternatively, 
the motion estimation and noise reduction processes described earlier may be performed, but 
with the video encoding portion performing intra-only encoding, and hence not making use of 
the motion estimation decision set chosen in the motion estimation process. For a hardware
encoder, there is little additional complexity involved in performing motion estimation on an
I picture, as the motion estimation components already exist and would otherwise be
unused.

In an embodiment of the current invention, spatial filtering may be applied to the input 
pictures prior to the motion estimation process. Figure 6 illustrates a system where spatial 
filtering is applied to input pictures, prior to encoding and motion estimation. For I pictures, 
motion estimation is not used, and the input to the encoding process is selected to be the 
spatially filtered input pictures. For P and B pictures, motion estimation is performed using 
the spatially filtered input pictures as input. 



Possible Claims: 

1. Encoder and noise reducer that share the same motion estimation function.

2. Storing more than one motion estimation decision set from the motion estimation 
function of the encoder, to be used by a motion compensated temporal filtering 
operation. 

3. Use of threshold in decision whether or not to include a predictor in the noise 
reduction filtering operation.

4. Using previously coded pictures as references in motion estimation, but applying
motion estimation decisions (motion vectors, etc.) to filtered pictures for noise 
reduction. 

5. Applying spatial filtering to the input pictures before motion estimation which is used 
for motion-compensated temporal filtering. 

Figure 1. Standard Video Encoder
Figure 2. Standard Video Encoder
Figure 3. Video Encoder w/ Noise Reducer
Figure 4. Flowchart of Encoder/Noise Reducer
Figure 6. Video Encoder w/ Noise Reducer and Spatial Filtering



